The Integrated Language Database of 8th–21st-Century Dutch

Author

  • J. G. Kruyt
Abstract

The Institute for Dutch Lexicology (INL) has a long-standing tradition in corpus-based lexicography. The results include electronic scholarly dictionaries of Dutch covering the vocabulary from 1200 up to 1976, linguistically annotated electronic text corpora of historical and present-day Dutch, and computational lexica. An ongoing long-term INL project, the Integrated Language Database of 8th–21st-Century Dutch (ILD), adds value to these data. The aim is to create a flexible linguistic research instrument by linking the dictionaries, a balanced diachronic text corpus, and lexica of historical and present-day Dutch. We will also link part of our data with data collections stored at other institutes, creating a supra-institutional research instrument. The paper reports on the overall ILD design and the user's perspective. The focus is on the ILD prototype, which, when finished, will serve as a demonstration model to verify and assess user needs. At present it serves to test the design empirically for its applicability to 'real data' and to obtain figures on workload and the like. The conclusion is that this current testing function has proved the prototype to be an indispensable pilot for the ILD.

Similar resources

Implementation and Evaluation of PAROLE PoS in a National Context

We are annotating the complete 20 million Dutch PAROLE corpus with PoS and lemma. The morphosyntactic tagging of 250,000 words during the PAROLE project was the first confrontation of the fine-grained Dutch PAROLE tagset, and its 'functional' mode of application, with real corpus data. The correction of the manual tagging and the compilation of a 100,000-word training corpus for the automatic t...

Pronunciation Barriers and Computer Assisted Language Learning (CALL): Coping the Demands of 21st Century in Second Language Learning Classroom in Pakistan

English pronunciation is a very important sub-skill of the speaking module in the second-language learning process. However, it is ignored, neglected, and given hardly any attention by teachers, administrators, and stakeholders, especially in Pakistan. Grammar, vocabulary, and the other linguistic skills such as reading and writing are emphasized whereas pronunciation has never be...

Language planning

Language planning, in one way or another, is as old as human civilization. Every time that one polity invaded the territory of another, the language of the conqueror was imposed on the conquered. The Romans imposed their language across the civilized world as they knew it. In the 21st century, the practice of language planning has become increasingly sophisticated. Eng...

An Exploration of Language Identification Techniques for the Dutch Folktale Database

The Dutch Folktale Database contains fairy tales, traditional legends, urban legends, and jokes written in a large variety and combination of languages including (Middle and 17th century) Dutch, Frisian and a number of Dutch dialects. In this work we compare a number of approaches to automatic language identification for this collection. We show that in comparison to typical language identifica...

Publication date: 2004